ATC TTTGTTCAGT TTACCTCAGG GCTATTATGA 33
34 AATGAAATGA GATAACCAAT GTGAAAGTCC TATAAACTGT ATAGCCTCCA TTCGGATGTA 93 
94 TGTCTTTGGC AGGATGATAA AGAATCAGGA AGAAGGAGTA TCCACGTTAG CCAAGTGTCC 153 
154 AGGCTGTGTC TGCTCTTATT TTAGTGACAG ATGTTGCTCC TGACAGAAGC TATTCTTCAG 213 
214 GAAACATCAC ATCCAATATG GTAAATCCAT CAAACAGGAG CTAAGAAACA GGAATGAGAT 273 
274 GGGCACTTGC CCAAGGAAAA ATGCCAGGAG AGCAAATAAT GATGAAAAAT AAACTTTTCC 333 



334 CTTTGTTTTT AATTTCAGGA AAAAATGATG AGGACCAAAA TCAATGAATA AGGAAAACAG 393 

(Prl.FPIII) CCTG AAAATGAATA AGAAA 

394 CTCAGAAAAA AGATGTTTCC AAATTGGTAA TTAAGTATTT GTTCCTTGGG AAGAGACCTC 453

(PR/GR-MMTV) GTTCTTTTGG AA

(SSRE) GAGACC 

454 CATGTGAGCT TGATGGGAAA ATGGGAAAAA CGTCAAAAGC ATGATCTGAT CAGATCCCAA 513 

514 AGTGGATTAT TATTTTAAAA ACCAGATGGC ATCACTCTGG GGAGGCAAGT TCAGGAAGGT 573 

574 CATGTTAGCA AAGGACATAA CAATAACAGC AAAATCAAAA TTCCGCAAAT GCAGGAGGAA 633 
CCTTTTAG-A AAGGACAAAA CAGAATG (nGRE-PRL) 

634 AATGGGGACT GGGAAAGCTT TCATAACAGT GATTAGGCAG TTGACCATGT TCGCAACACC 693 

694 TCCCCGTCTA TACCAGGGAA CACAAAAATT GACTGGGCTA AGCCTGGACT TTCAAGGGAA 753 

GCCTGGACT GTC (CBE-P53) 

754 ATATGAAAAA CTGAGAGCAA AACAAAAGAC ATGGTTAAAA GGCAACCAGA ACATTGTGAG 813 

ATTTTTCTGA TTGGTTAAAA GT (NFEi ) 

814 CCTTCAAAGC AGCAGTGCCC CTCAGCAGGG ACCCTGAGGC ATTTGCCTTT AGGAAGGCCA 873 

G ACCCTGAGGC T (KTF.l-CS) 

874 GTTTTCTTAA GGAATCTTAA GAAACTCTTG AAAGATCATG AATTTTAACC ATTTTAAGTA 933 

934 TAAAACAAAT ATGCGATGCA TAATCAGTTT AGACATGGGT CCCAATTTTA TAAAGTCAGG 993 

(PRE-lysozyme) AGGCCGT 

994 CATACAAGGA TAACGTGTCC CAGCTCCGGA TAGGTCAGAA ATCATTAGAA ATCACTGTGT 1053 
GATCCAAGGA GCAGAAGTTC CAGCTATGGT CAG (GRE-hMT) GG TACACTGTGT 

1054 CCCCATCCTA ACTTTTTCAG AATGATCTGT CATAGCCCTC ACACACAGGC CCGATGTGTC 1113
CCT 



1114 TGACCTACAA CCACATCTAC AACCCAAGTG CCTCAACCAT TGTTAACGTG TCATCTCAGT 1173
1174 AGGTCCCATT ACAAATGCCA CCTCCCCTGT GCAGCCCATC CCGCTCCACA GGAAGTCTCC 1233

1234 CCACTCTAGA CTTCTGCATC ACGATGTTAC AGCCAGAAGC TCCGTGAGGG TGAGGGTCTG 1293

(SSRE) GGTCtC 

1294 TGTCTTACAC CTACCTGTAT GCTCTACACC TGAGCTCACT GCAACCTCTG CCTCCCAGGT 1353 

1354 TCAAGCAATT CTCCTGTCTC AGCCTCCCGC GTAGCTGGGA CTACAGGCGC ACGCCCGGCT 1413 

C AGCCCCCCGC GCAGC (ETF.EGFR) 



1414 AATTTTTGTA TTGTTAGTAG AGATGGGGTT TCACCATATT AGCCCGGCTG GTCTTGAACT 1473 
Alu Repeat Region CCATATT AGG (SRE-cFos) 

1474 CCTGACCTCA GGTGATCCAC CCACCTCAGC CTCCTAAAGT GCTGGGATTA CAGGCATGAG 1533 

1534 TCACCGCGCC CGGCCAAGGG TCAGTGTTTA ATAAGGAATA ACTTGAATGG TTTACTAAAC 1593 

1594 CAACAGGGAA ACAGACAAAA GCTGTGATAA TTTCAGGGAT TCTTGGGATG GGGAATGGTG 1653 

1654 CCATGAGCTG CCTGCCTAGT CCCAGACCAC TGGTCCTCAT CACTTTCTTC CCTCATCCTC 1713 

1714 ATTTTCAGGC TAAGTTACCA TTTTATTCAC CATGCTTTTG TGGTAAGCCT CCACATCGTT 1773 

1774 ACTGAAATAA GAGTATACAT AAACTAGTTC CATTTGGGGC CATCTGTGTG TGTGTATAGG 1833 
GTTTACAT AAAC (VBP-vitel) GG

1834 GGAGGAGGGC ATACCCCAGA GACTCCTTGA AGCCCCCGGC AGAGGTTTCC TCTCCAGCTG 1893 
GGAKGAGG (MalT-CS) 

1894 GGGGAGCCCT GCAAGCACCC GGGGTCCTGG GTGTCCTGAG CAACCTGCCA GCCCGTGCCA 1953 

1954 CTGGTTGTTT TGTTATCACT CTCTAGGGAC CTGTTGCTTT CTATTTCTGT GTGACTCGTT 2013 

2014 CATTCATCCA GGCATTCATT GACAATTTAT TGAGTACTTA TATCTGCCAG ACACCAGAGA 2073 

2074 CAAAATGGTG AGCAAAGCAG TCACTGCCCT ACCTTCGTGG AGGTGACAGT TTCTCATGGA 2133 

2134 AGACGTGCAG AAGAAAATTA ATAGCCAGCC AACTTAAACC CAGTGCTGAA AGAAAGGAAA 2193 

GCGTGAC CGGAGCTGAA AGAAAGGAAC 

2194 TAAACACCAT CTTGAAGAAT TGTGCGCAGC ATCCCTTAAC AAGGCCACCT CCCTAGCGCC 2253 
AC (ERE-c.vitel) 

2254 CCCTGCTGCC TCCATCGTGC CCGGAGGCCC CCAAGCCCGA GTCTTCCAAG CCTCCTCCTC 2313 

2314 CATCAGTCAC AGCGCTGCAG CTGGCCTGCC TCGCTTCCcG TGAATCGTCC TGGTGCATCT 2373 

AGCAG CTGGC (NF-mutagen) 

2374 GAGCTGGAGA CTCCTTGGCT CCAGGCTCCA GAAAGGAAAT GGAGAGGGAA ACTAGTCTAA 2433 

A GAAAGGGAAA GGA (PRF-myc) 

2434 CGGAGAATCT GGAGGGGACA GTGTTTCCTC AGAGGGAAAG GGGCCTCCAC GTCCAGGAGA 2493 
ACCCGGTACA CTGTGTCCTC CCGCT (GRE-hMT.IIa)
CC CTTTGGGCCA ATGTGTCCTG AGGGGA (GRE-hGH)
2494 ATTCCAGGAG GTGGGGACTG CAGGGAGTGG GGACGCTGGG GCTGAGCGGG TGCTGAAAGG 2553

CTGG GGAGCCTGGG GA (AP.2-SV40) 

2554 CAGGAAGGTG AAAAGGGCAA GGCTGAAGCT GCCCAGATGT TCAGTGTTGT TCACGGGGCT 2613 

2614 GGGAGTTTTC CGTTGCTTCC TGTGAGCCTT TTTATCTTTT CTCTGCTTGG AGGAGAAGAA 2673
CT CGTTGCTTCG AG (HSTF-hsp70) 

2674 GTCTATTTCA TGAAGGGATG CAGTTTCATA AAGTCAGCTG TTAAAATTCC AGGGTGTGCA 2733 

A 

2734 TGGGTTTTCC TTCACGAAGG CCTTTATTTA ATGGGAATAT AGGAAGCGAG CTCATTTCCT 2793
TGGGTTTTTG (SBF. yeast) 

2794 AGGCCGTTAA TTCACGGAAG AAGTGACTGG AGTCTTTTCT TTCATGTCTT CTGGGCAACT 2853

2854 ACTCAGCCCT GTGGTGGACT TGGCTTATGC AAGACGGTCG AAAACCTTGG AATCAGGAGA 2913 

2914 CTCGGTTTTC TTTCTGGTTC TGCCATTGGT TGGCTGTGCG ACCGTGGGCA AGTGTCTCTC 2973 
C TTTCTGGTTT TGCAG (NF.1-bithorax)
(NF-MHC II/) CCATTGGT T

2974 CTTCCCTGGG CCATAGTCTT CTCTGCTATA AAGACCCTTG CAGCTCTCGT GTTCTGTGAA 3033 

3034 CACTTCCCTG TGATTCTCTG TGAGGGGGGA TGTTGAGAGG GGAAGGAGGC AGAGCTGGAG 3093 

3094 CAGCTGAGCC ACAGGGGAGG TGGAGGGGGA CAGGAAGGCA GGCAGAAGCT GGGTGCTCCA 3153 

3154 TCAGTCCTCA CTGATCACGT CAGACTCCAG GACCGAGAGC CACAATGCTT CAGGAAAGCT 3213

3214 CAATGAACCC AACAGCCACA TTTTCCTTCC CTAAGCATAG ACAATGGCAT TTGCCAATAA 3273

3274 CCAAAAAGAA TGCAGAGACT AACTGGTGGT AGCTTTTGCC TGGCATTCAA AAACTGGGCC 3333 
GMGTGACT AACTG (PEA.1-Polyoma)

3334 AGAGCAAGTG GAAAATGCCA GAGATTGTTA AACTTTTCAC CCTGACCAGC ACCCCACGCA 3393

3394 GCTCAGCAGT GACTGCTGAC AGCACGGAGT GACCTGCAGC GCAGGGGAGG AGAAGAAAAA 3453 

C AGGTCAGAGT GACCTG (ERE.2-Vitel.)

3454 GAGAGGGATA GTGTATGAGC AAGAAAGACA GATTCATTCA AGGGCAGTGG GAATTGACCA 3513 

3514 CAGGGATTAT AGTCCACGTG ATCCTGGGTT CTAGGAGGCA GGGCTATATT GTGGGGGGAA 3573 

(GRE-FLV) CGGGATAC CGAGAGAACA GGGCTATAGG 

3574 AAAATCAGTT CAAGGGAAGT CGGGAGACCT GATTTCTAAT ACTATATTTT TCCTTTACAA 3633 

GAGACC (SSRE) 

3634 GCTGAGTAAT TCTGAGCAAG TCACAAGGTA GTAACTGAGG CTGTAAGATT ACTTAGTTTC 3693 

(ICS-MTII/ HLA-DR/)AGTTTC 

3694 TCCTTATTAG GAACTCTTTT TCTCTGTGGA GTTAGCAGCA CAAGGGCAAT CCCGTTTCTT 3753 
TCCTCT 

3754 TTAACAGGAA GAAAACATTC CTAAGAGTAA AGCCAAACAG ATTCAAGCCT AGGTCTTGCT 3813 
3814 GACTATATGA TTGGTTTTTT GAAAAATCAT TTCAGCGATG TTTACTATCT GATTCAGAAA 3873

3874 ATGAGACTAG TACCCTTTGG TCAGCTGTAA ACAAACACCC ATTTGTAAAT GTCTCAAGTT 3933

3934 CAGGCTTAAC TGCAGAACCA ATCAAATAAG AATAGAATCT TTAGAGCAAA CTGTGTTTCT 3993

3994 CCACTCTGGA GGTGAGTCTG CCAGGGCAGT TTGGAAATAT TTACTTCACA AGTATTGACA 4053

4054 CTGTTGTTGG TATTAACAAC ATAAAGTTGC TCAAAGGCAA TCATTATTTC AAGTGGCTTA 4113

4114 AAGTTACTTC TGACAGTTTT GGTATATTTA TTGGCTATTG CCATTTGCTT TTTGTTTTTT 4173

(NF.1-HCMV) TTGGCTATTG GCCA

4174 CTCTTTGGGT TTATTAATGT AAAGCAGGGA TTATTAACCT ACAGTCCAGA AAGCCTGTGA 4233

CTCTTT (ISGF2)

4234 ATTTGAATGA GGAAAAAATT ACATTTTTGT TTTTACCACC TTCTAACTAA ATTTAACATT 4293

(Zn binding) --- 

4294 TTATTCCATT GCGAATAGAG CCATAAACTC AAAGTGGTAA TAACAGTACC TGTGATTTTG 4353 

4354 TCATTACCAA TAGAAATCAC AGACATTTTA TACTATATTA CAGTTGTTGC AGATACGTTG 4413 

(CAP-galO) ATTTA TTCCATGTCA CACTTTTCGC A 

4414 TAAGTGAAAT ATTTATACTC AAAACTACTT TGAAATTAGA CCTCCTGCTG GATCTTGTTT 4473 

TTACTC A' (AP-1) 

4474 TTAACATATT AATAAAACAT GTTTAAAATT TTGATATTTT GATAATCATA TTTCATTATC 4533 

GAT GTTTAAAAT (PRL-FPII) 

4534 ATTTGTTTCC TTTGTAATCT ATATTTTATA TATTTGAAAA CATCTTTCTG AGAAGAGTTC 4593 

(GRE-MuRFV) TGTTTTTCTG AGAACATCAG 

4594 CCCAGATtTC ACCAATGAGG TTCTTGGCAT GCACACACAC AGAGTAAGAA CTGATTTAGA 4653 

CCAGATCTC ACCATCATTAT (nGRE) CACACACAC A (CACA) 
CTCTGG GGACAC AGAGTAGGG (AP.l-TGFb) 

4654 GGCTAACATT GACATTGGTG CCTGAGATGC AAGACTGAAA TTAGAAAGTT CTCCCAAAGA 4713 

(GC2) GATGCT GATGGATAAT TTAGAAGCTT CTCCCACA 



4714 TACACAGTTG TTTTAAAGCT AGGGGTGAGG GGGGAAATCT GCCGCTTCTA TAGGAATGCT 4773

(PEA.3)AGGAA GGT 



4774 CTCCCTGGAG CCTGGTAGGG TGCTGTCCTT GTGTTCTGGC TGGCTGTTAT TTTTCTCTGT 4833 
CTC V (SSRE) MIR Repeat Region 

4834 CCCTGCTACG TCTTAAAGGA CTTGTTTGGA TCTCCAGTTC CTAGCATAGT GCCTGGCACA 4893

GGA CTTGTTTGTT CT (GRE- rTAT- I I ) TGGGCACA 
GCAAAAAGGA TCTATTTGGA A (GRE-MMTV) 

4894 GTGCAGGTTC TCAATGAGTT TGCAGAGTGA ATGGAAATAT AAACTAGAAA TATATCCTTG 4953 
GTGCCAA (NF-1 (HNF-l)C TGTGAAATAT TAACTAAA 

4954 TTGAAATCAG CACACCAGTA GTCCTGGTGT AAGTGTGTGT ACGTGTGTGT GTGTGTGTGT 5013
5014 GTGTGTGTGT AAAACCAGGT GGAGATATAG GAACTATTAT TGGGGTATGG GTGCATAAAT 5073

cat/ reverse cat box 

5074 TGGGATGTTC TTTTTAAAAA GAAACTCCAA ACAGACTTCT GGAAGGTTAT TTTCTAAGAA 5133 
(1/2GRE)TGTTC T (HSTF) GAAACTTCT GGAATATTCC CGAACTTTC 

C CTTTTAGAAA GGA---CAAA ACAGAATG(nGRE-Prl) 

5134 TCTTGCTGGC AGCGTGAAGG CAACCCCCCT GTGCACAGCC CCACCCAGCC TCACGTGGCC 5193 
(1/2 TREJAGG CAA T-CC CCAGGCTCCC -CAG(AP.2-SV40) 

GGAGAGCC.CC (NF-KB) 

1 

5194 ACCTCTGTCT TCCCCCATGA AGGGCTGGCT CCCCAGTATA TATAAACCTC TCTGGAGCTC 5253

tata box GGTC TCS (SSRE) 

5254 GGGCATGAGC CAGCAAGGCC ACCCATCCAG GCACCTCTCA GCACAGC 5300

Start Sites 



1474 CCTGACCTCA GGTGATCCAC CCACCTCAGC CTCCTAAAGT GCTGGGATTA CAGGCATGAG 1533
1534 TCACCGCGCC CGGCCAAGGG TCAGTGTTTA ATAAGGAATA ACTTGAATGG TTTACTAAAC 1593
1594 CAACAGGGAA ACAGACAAAA GCTGTGATAA TTTCAGGGAT TCTTGGGATG GGGAATGGTG 1653
1654 CCATGAGCTG CCTGCCTAGT CCCAGACCAC TGGTCCTCAT CACTTTCTTC CCTCATCCTC 1713
1714 ATTTTCAGGC TAAGTTACCA TTTTATTCAC CATGCTTTTG TGGTAAGCCT CCACATCGTT 1773
1774 ACTGAAATAA GAGTATACAT AAACTAGTTC CATTTGGGGC CATCTGTGTG TGTGTATAGG 1833
1834 GGAGGAGGGC ATACCCCAGA GACTCCTTGA AGCCCCCGGC AGAGGTTTCC TCTCCAGCTG 1893
1894 GGGGAGCCCT GCAAGCACCC GGGGTCCTGG GTGTCCTGAG CAACCTGCCA GCCCGTGCCA 1953
1954 CTGGTTGTTT TGTTATCACT CTCTAGGGAC CTGTTGCTTT CTATTTCTGT GTGACTCGTT 2013
2014 CATTCATCCA GGCATTCATT GACAATTTAT TGAGTACTTA TATCTGCCAG ACACCAGAGA 2073
2074 CAAAATGGTG AGCAAAGCAG TCACTGCCCT ACCTTCGTGG AGGTGACAGT TTCTCATGGA 2133
2134 AGACGTGCAG AAGAAAATTA ATAGCCAGCC AACTTAAACC CAGTGCTGAA AGAAAGGAAA 2193
2194 TAAACACCAT CTTGAAGAAT TGTGCGCAGC ATCCCTTAAC AAGGCCACCT CCCTAGCGCC 2253
2254 CCCTGCTGCC TCCATCGTGC CCGGAGGCCC CCAAGCCCGA GTCTTCCAAG CCTCCTCCTC 2313
2314 CATCAGTCAC AGCGCTGCAG CTGGCCTGCC TCGCTTCCCG TGAATCGTCC TGGTGCATCT 2373
2374 GAGCTGGAGA CTCCTTGGCT CCAGGCTCCA GAAAGGAAAT GGAGAGGGAA ACTAGTCTAA 2433
2434 CGGAGAATCT GGAGGGGACA GTGTTTCCTC AGAGGGAAAG GGGCCTCCAC GTCCAGGAGA 2493
2494 ATTCCAGGAG GTGGGGACTG CAGGGAGTGG GGACGCTGGG GCTGAGCGGG TGCTGAAAGG 2553
2554 CAGGAAGGTG AAAAGGGCAA GGCTGAAGCT GCCCAGATGT TCAGTGTTGT TCACGGGGCT 2613
2614 GGGAGTTTTC CGTTGCTTCC TGTGAGCCTT TTTATCTTTT CTCTGCTTGG AGGAGAAGAA 2673
2674 GTCTATTTCA TGAAGGGATG CAGTTTCATA AAGTCAGCTG TTAAAATTCC AGGGTGTGCA 2733
2734 TGGGTTTTCC TTCACGAAGG CCTTTATTTA ATGGGAATAT AGGAAGCGAG CTCATTTCCT 2793
2794 AGGCCGTTAA TTCACGGAAG AAGTGACTGG AGTCTTTTCT TTCATGTCTT CTGGGCAACT 2853
2854 ACTCAGCCCT GTGGTGGACT TGGCTTATGC AAGACGGTCG AAAACCTTGG AATCAGGAGA 2913
2914 CTCGGTTTTC TTTCTGGTTC TGCCATTGGT TGGCTGTGCG ACCGTGGGCA AGTGTCTCTC 2973
2974 CTTCCCTGGG CCATAGTCTT CTCTGCTATA AAGACCCTTG CAGCTCTCGT GTTCTGTGAA 3033
3034 CACTTCCCTG TGATTCTCTG TGAGGGGGGA TGTTGAGAGG GGAAGGAGGC AGAGCTGGAG 3093
3094 CAGCTGAGCC ACAGGGGAGG TGGAGGGGGA CAGGAAGGCA GGCAGAAGCT GGGTGCTCCA 3153
3154 TCAGTCCTCA CTGATCACGT CAGACTCCAG GACCGAGAGC CACAATGCTT CAGGAAAGCT 3213
3214 CAATGAACCC AACAGCCACA TTTTCCTTCC CTAAGCATAG ACAATGGCAT TTGCCAATAA 3273
3274 CCAAAAAGAA TGCAGAGACT AACTGGTGGT AGCTTTTGCC TGGCATTCAA AAACTGGGCC 3333 
3334 AGAGCAAGTG GAAAATGCCA GAGATTGTTA AACTTTTCAC CCTGACCAGC ACCCCACGCA 3393 
3394 GCTCAGCAGT GACTGCTGAC AGCACGGAGT GACCTGCAGC GCAGGGGAGG AGAAGAAAAA 3453 
3454 GAGAGGGATA GTGTATGAGC AAGAAAGACA GATTCATTCA AGGGCAGTGG GAATTGACCA 3513
3514 CAGGGATTAT AGTCCACGTG ATCCTGGGTT CTAGGAGGCA GGGCTATATT GTGGGGGGAA 3573 
3574 AAAATCAGTT CAAGGGAAGT CGGGAGACCT GATTTCTAAT ACTATATTTT TCCTTTACAA 3633 
3634 GCTGAGTAAT TCTGAGCAAG TCACAAGGTA GTAACTGAGG CTGTAAGATT ACTTAGTTTC 3693 
3694 TCCTTATTAG GAACTCTTTT TCTCTGTGGA GTTAGCAGCA CAAGGGCAAT CCCGTTTCTT 3753 
3754 TTAACAGGAA GAAAACATTC CTAAGAGTAA AGCCAAACAG ATTCAAGCCT AGGTCTTGCT 3813 
3814 GACTATATGA TTGGTTTTTT GAAAAATCAT TTCAGCGATG TTTACTATCT GATTCAGAAA 3873 
3874 ATGAGACTAG TACCCTTTGG TCAGCTGTAA ACAAACACCC ATTTGTAAAT GTCTCAAGTT 3933 
3934 CAGGCTTAAC TGCAGAACCA ATCAAATAAG AATAGAATCT TTAGAGCAAA CTGTGTTTCT 3993 
3994 CCACTCTGGA GGTGAGTCTG CCAGGGCAGT TTGGAAATAT TTACTTCACA AGTATTGACA 4053 
4054 CTGTTGTTGG TATTAACAAC ATAAAGTTGC TCAAAGGCAA TCATTATTTC AAGTGGCTTA 4113 
4114 AAGTTACTTC TGACAGTTTT GGTATATTTA TTGGCTATTG CCATTTGCTT TTTGTTTTTT 4173
4174 CTCTTTGGGT TTATTAATGT AAAGCAGGGA TTATTAACCT ACAGTCCAGA AAGCCTGTGA 4233 
4234 ATTTGAATGA GGAAAAAATT ACGTTTTTAT TTTTACCACC TTCTAACTAA ATTTAACATT 4293 
4294 TTATTCCATT GCGAATAGAG CCATAAACTC AAAGTGGTAA TAAGAGTACC TGTGATTTTG 4353 
4354 TCATTACCAA TAGAAATCAC AGACATTTTA TACTATATTA CAGTTGTTGC AGGTACGTTG 4413 
4414 TAAGTGAAAT ATTTATACTC AAAACTACTT TGAAATTAGA CCTCCTGCTG GATCTTGTTT 4473 
4474 TTAACATATT AATAAAACAT GTTTAAAATT TTGATATTTT GATAATCATA TTTCATTATC 4533 
4534 ATTTGTTTCC TTTGTAATCT ATATTTTATA TATTTGAAAA CATCTTTCTG AGAAGAGTTC 4593 
4594 CCCAGATTTC ACCAATGAGG TTCTTGGCAT GCACACACAC AGAGTAAGAA CTGATTTAGA 4653 
4654 GGCTAACATT GACATTGGTG CCTGAGATGC AAGACTGAAA TTAGAAAGTT CTCCCAAAGA 4713
4714 TACACAGTTG TTTTAAAGCT AGGGGTGAGG GGGGAAATCT GCCGCTTCTA TAGGAATGCT 4773
4774 CTCCCTGGAG CCTGGTAGGG TGCTGTCCTT GTGTTCTGGC TGGCTGTTAT TTTTCTCTGT 4833 
4834 CCCTGCTACG TCTTAAAGGA CTTGTTTGGA TCTCCAGTTC CTAGCATAGT GCCTGGCACA 4893 
4894 GTGCAGGTTC TCAATGAGTT TGCAGAGTGA ATGGAAATAT AAACTAGAAA TATATCTTTG 4953 
4954 TTGAAATCAG CACACCAGTA GTCCTGGTGT AAGTGTGTGT ACGTGTGTGTGTGT GTGTGTGTGT 5017
5018 GTGTGTGTGT AAAACCAGGT GGAGATATAG GAACTATTAT TGGGGTATGG GTGCATAAAT 5077 
5078 TGGGATGTTC TTTTTAAAAA GAAACTCCAA ACAGACTTCT GGAAGGTTAT TTTCTAAGAA 5137 
5138 TCTTGCTGGC AGCGTGAAGG CAACCCCCCT GTGCACAGCC CCACCCAGCC TCACGTGGCC 5197
5198 ACCTCTGTCT TCCCCCATGA AGGGCTGGCT CCCCAGTATA TATAAACCTC TCTGGAGCTC 5257
5258 GGGCATGAGC CAGCAAGGCC ACCCATCCAG GCACCTCTCA GCACAGC 5304

FIG. 2D

1 ATCTTTGTTC AGTTTACCTC AGGGCTATTA TGAAATGAAA TGAGATAACC
51 AATGTGAAAG TCCTATAAAC TGTATAGCCT CCATTCGGAT GTATGTCTTT
101 GGCAGGATGA TAAAGAATCA GGAAGAAGGA GTATCCACGT TAGCCAAGTG
151 TCCAGGCTGT GTCTGCTCTT ATTTTAGTGA CAGATGTTGC TCCTGACAGA
201 AGCTATTCTT CAGGAAACAT CACATCCAAT ATGGTAAATC CATCAAACAG
251 GAGCTAAGAA ACAGGAATGA GATGGGCACT TGCCCAAGGA AAAATGCCAG
301 GAGAGCAAAT AATGATGAAA AATAAACTTT TCCCTTTGTT TTTAATTTCA
351 GGAAAAAATG ATGAGGACCA AAATCAATGA ATAAGGAAAA CAGCTCAGAA
401 AAAAGATGTT TCCAAATTGG TAATTAAGTA TTTGTTCCTT GGGAAGAGAC
451 CTCCATGTGA GCTTGATGGG AAAATGGGAA AAACGTCAAA AGCATGATCT
501 GATCAGATCC CAAAGTGGAT TATTATTTTA AAAACCAGAT GGCATCACTC
551 TGGGGAGGCA AGTTCAGGAA GGTCATGTTA GCAAAGGACA TAACAATAAC
601 AGCAAAATCA AAATTCCGCA AATGCAGGAG GAAAATGGGG ACTGGGAAAG
651 CTTTCATAAC AGTGATTAGG CAGTTGACCA TGTTCGCAAC ACCTCCCCGT
701 CTATACCAGG GAACACAAAA ATTGACTGGG CTAAGCCTGG ACTTTCAAGG
751 GAAATATGAA AAACTGAGAG CAAAACAAAA GACATGGTTA AAAGGCAACC
801 AGAACATTGT GAGCCTTCAA AGCAGCAGTG CCCCTCAGCA GGGACCCTGA
851 GGCATTTGCC TTTAGGAAGG CCAGTTTTCT TAAGGAATCT TAAGAAACTC


901 TTGAAAGATC ATGAATTTTA ACCATTTTAA GTATAAAACA AATATGCGAT

951 GCATAATCAG TTTAGACATG GGTCCCAATT TTATAAAGTC AGGCATACAA

1001 GGATAACGTG TCCCAGCTCC GGATAGGTCA GAAATCATTA GAAATCACTG

1051 TGTCCCCATC CTAACTTTTT CAGAATGATC TGTCATAGCC CTCACACACA 

1101 GGCCCGATGT GTCTGACCTA CAACCACATC TACAACCCAA GTGCCTCAAC 

1151 CATTGTTAAC GTGTCATCTC AGTAGGTCCC ATTACAAATG CCACCTCCCC 

1201 TGTGCAGCCC ATCCCGCTCC ACAGGAAGTC TCCCCACTCT AGACTTCTGC 

1251 ATCACGATGT TACAGCCAGA AGCTCCGTGA GGGTGAGGGT CTGTGTCTTA
1301 CACCTACCTG TATGCTCTAC ACCTGAGCTC ACTGCAACCT CTGCCTCCCA
1351 GGTTCAAGCA ATTCTCCTGT CTCAGCCTCC CGCGTAGCTG GGACTACAGG
1401 CGCACGCCCG GCTAATTTTT GTATTGTTAG TAGAGATGGG GTTTCACCAT
1451 ATTAGCCCGG CTGGTCTTGA ACTCCTGACC TCAGGTGATC CACCCACCTC
1501 AGCCTCCTAA AGTGCTGGGA TTACAGGCAT GAGTCACCGC GCCCGGCCAA
1551 GGGTCAGTGT TTAATAAGGA ATAACTTGAA TGGTTTACTA AACCAACAGG
1601 GAAACAGACA AAAGCTGTGA TAATTTCAGG GATTCTTGGG ATGGGGAATG
1651 GTGCCATGAG CTGCCTGCCT AGTCCCAGAC CACTGGTCCT CATCACTTTC
1701 TTCCCTCATC CTCATTTTCA GGCTAAGTTA CCATTTTATT CACCATGCTT
1751 TTGTGGTAAG CCTCCACATC GTTACTGAAA TAAGAGTATA CATAAACTAG
1801 TTCCATTTGG GGCCATCTGT GTGTGTGTAT AGGGGAGGAG GGCATACCCC
1851 AGAGACTCCT TGAAGCCCCC GGCAGAGGTT TCCTCTCCAG CTGGGGGAGC
1901 CCTGCAAGCA CCCGGGGTCC TGGGTGTCCT GAGCAACCTG CCAGCCCGTG
1951 CCACTGGTTG TTTTGTTATC ACTCTCTAGG GACCTGTTGC TTTCTATTTC
2001 TGTGTGACTC GTTCATTCAT CCAGGCATTC ATTGACAATT TATTGAGTAC
2051 TTATATCTGC CAGACACCAG AGACAAAATG GTGAGCAAAG CAGTCACTGC
2101 CCTACCTTCG TGGAGGTGAC AGTTTCTCAT GGAAGACGTG CAGAAGAAAA
2151 TTAATAGCCA GCCAACTTAA ACCCAGTGCT GAAAGAAAGG AAATAAACAC
2201 CATCTTGAAG AATTGTGCGC AGCATCCCTT AACAAGGCCA CCTCCCTAGC
2251 GCCCCCTGCT GCCTCCATCG TGCCCGGAGG CCCCCAAGCC CGAGTCTTCC
2301 AAGCCTCCTC CTCCATCAGT CACAGCGCTG CAGCTGGCCT GCCTCGCTTC
2351 CCGTGAATCG TCCTGGTGCA TCTGAGCTGG AGACTCCTTG GCTCCAGGCT
2401 CCAGAAAGGA AATGGAGAGG GAAACTAGTC TAACGGAGAA TCTGGAGGGG
2451 ACAGTGTTTC CTCAGAGGGA AAGGGGCCTC CACGTCCAGG AGAATTCCAG
2501 GAGGTGGGGA CTGCAGGGAG TGGGGACGCT GGGGCTGAGC GGGTGCTGAA
2551 AGGCAGGAAG GTGAAAAGGG CAAGGCTGAA GCTGCCCAGA TGTTCAGTGT
2601 TGTTCACGGG GCTGGGAGTT TTCCGTTGCT TCCTGTGAGC CTTTTTATCT
2651 TTTCTCTGCT TGGAGGAGAA GAAGTCTATT TCATGAAGGG ATGCAGTTTC
2701 ATAAAGTCAG CTGTTAAAAT TCCAGGGTGT GCATGGGTTT TCCTTCACGA
2751 AGGCCTTTAT TTAATGGGAA TATAGGAAGC GAGCTCATTT CCTAGGCCGT
2801 TAATTCACGG AAGAAGTGAC TGGAGTCTTT TCTTTCATGT CTTCTGGGCA
2851 ACTACTCAGC CCTGTGGTGG ACTTGGCTTA TGCAAGACGG TCGAAAACCT
2901 TGGAATCAGG AGACTCGGTT TTCTTTCTGG TTCTGCCATT GGTTGGCTGT
2951 GCGACCGTGG GCAAGTGTCT CTCCTTCCCT GGGCCATAGT CTTCTCTGCT
3001 ATAAAGACCC TTGCAGCTCT CGTGTTCTGT GAACACTTCC CTGTGATTCT
3051 CTGTGAGGGG GGATGTTGAG AGGGGAAGGA GGCAGAGCTG GAGCAGCTGA
3101 GCCACAGGGG AGGTGGAGGG GGACAGGAAG GCAGGCAGAA GCTGGGTGCT
3151 CCATCAGTCC TCACTGATCA CGTCAGACTC CAGGACCGAG AGCCACAATG
3201 CTTCAGGAAA GCTCAATGAA CCCAACAGCC ACATTTTCCT TCCCTAAGCA
3251 TAGACAATGG CATTTGCCAA TAACCAAAAA GAATGCAGAG ACTAACTGGT
3301 GGTAGCTTTT GCCTGGCATT CAAAAACTGG GCCAGAGCAA GTGGAAAATG
3351 CCAGAGATTG TTAAACTTTT CACCCTGACC AGCACCCCAC GCAGCTCAGC
3401 AGTGACTGCT GACAGCACGG AGTGACCTGC AGCGCAGGGG AGGAGAAGAA
3451 AAAGAGAGGG ATAGTGTATG AGCAAGAAAG ACAGATTCAT TCAAGGGCAG
3501 TGGGAATTGA CCACAGGGAT TATAGTCCAC GTGATCCTGG GTTCTAGGAG
3551 GCAGGGCTAT ATTGTGGGGG GAAAAAATCA GTTCAAGGGA AGTCGGGAGA
3601 CCTGATTTCT AATACTATAT TTTTCCTTTA CAAGCTGAGT AATTCTGAGC
3651 AAGTCACAAG GTAGTAACTG AGGCTGTAAG ATTACTTAGT TTCTCCTTAT
3701 TAGGAACTCT TTTTCTCTGT GGAGTTAGCA GCACAAGGGC AATCCCGTTT
3751 CTTTTAACAG GAAGAAAACA TTCCTAAGAG TAAAGCCAAA CAGATTCAAG
3801 CCTAGGTCTT GCTGACTATA TGATTGGTTT TTTGAAAAAT CATTTCAGCG
3851 ATGTTTACTA TCTGATTCAG AAAATGAGAC TAGTACCCTT TGGTCAGCTG
3901 TAAACAAACA CCCATTTGTA AATGTCTCAA GTTCAGGCTT AACTGCAGAA
3951 CCAATCAAAT AAGAATAGAA TCTTTAGAGC AAACTGTGTT TCTCCACTCT

FIG.3C

4001 GGAGGTGAGT CTGCCAGGGC AGTTTGGAAA TATTTACTTC ACAAGTATTG
4051 ACACTGTTGT TGGTATTAAC AACATAAAGT TGCTCAAAGG CAATCATTAT 
4101 TTCAAGTGGC TTAAAGTTAC TTCTGACAGT TTTGGTATAT TTATTGGCTA 
4151 TTGCCATTTG CTTTTTGTTT TTTCTCTTTG GGTTTATTAA TGTAAAGCAG 
4201 GGATTATTAA CCTACAGTCC AGAAAGCCTG TGAATTTGAA TGAGGAAAAA 
4251 ATTACATTTT TGTTTTTACC ACCTTCTAAC TAAATTTAAC ATTTTATTCC 
4301 ATTGCGAATA GAGCCATAAA CTCAAAGTGG TAATAACAGT ACCTGTGATT 
4351 TTGTCATTAC CAATAGAAAT CACAGACATT TTATACTATA TTACAGTTGT 
4401 TGCAGATACG TTGTAAGTGA AATATTTATA CTCAAAACTA CTTTGAAATT 
4451 AGACCTCCTG CTGGATCTTG TTTTTAACAT ATTAATAAAA CATGTTTAAA 
4501 ATTTTGATAT TTTGATAATC ATATTTCATT ATCATTTGTT TCCTTTGTAA 
4551 TCTATATTTT ATATATTTGA AAACATCTTT CTGAGAAGAG TTCCCCAGAT 
4601 TTCACCAATG AGGTTCTTGG CATGCACACA CACAGAGTAA GAACTGATTT 
4651 AGAGGCTAAC ATTGACATTG GTGCCTGAGA TGCAAGACTG AAATTAGAAA 
4701 GTTCTCCCAA AGATACACAG TTGTTTTAAA GCTAGGGGTG AGGGGGGAAA 
4751 TCTGCCGCTT CTATAGGAAT GCTCTCCCTG GAGCCTGGTA GGGTGCTGTC 
4801 CTTGTGTTCT GGCTGGCTGT TATTTTTCTC TGTCCCTGCT ACGTCTTAAA 
4851 GGACTTGTTT GGATCTCCAG TTCCTAGCAT AGTGCCTGGC ACAGTGCAGG 
4901 TTCTCAATGA GTTTGCAGAG TGAATGGAAA TATAAACTAG AAATATATCC 
4951 TTGTTGAAAT CAGCACACCA GTAGTCCTGG TGTAAGTGTG TGTACGTGTG 
5001 TGTGTGTGTG TGTGTGTGTG TGTAAAACCA GGTGGAGATA TAGGAACTAT 
5051 TATTGGGGTA TGGGTGCATA AATTGGGATG TTCTTTTTAA AAAGAAACTC 
5101 CAAACAGACT TCTGGAAGGT TATTTTCTAA GAATCTTGCT GGCAGCGTGA 
5151 AGGCAACCCC CCTGTGCACA GCCCCACCCA GCCTCACGTG GCCACCTCTG 
5201 TCTTCCCCCA TGAAGGGCTG GCTCCCCAGT ATATATAAAC CTCTCTGGAG 
5251 CTCGGGCATG AGCCAGCAAG GCCACCCATC CAGGCACCTC TCAGCACAGC 5300
1 AGAGCTTTCCAGAGGAAGCCTCACCAAGCCTCTGCAATGAGGTTCTTCTGTGCACGTTGC 60
61 TGCAGCTTTGGGCCTGAGATGCCAGCTGTCCAGCTGCTGCTTCTGGCCTGCCTGGTGTGG 120 
121 GATGTGGGGGCCAGGACAGCTCAGCTCAGGAAGGCCAATGACCAGAGTGGCCGATGCCAG 180 
181 TATACCTTCAGTGTGGCCAGTCCCAATGAATCCAGCTGCCCAGAGCAGAGCCAGGCCATG 240 
241 TCAGTCATCCATAACTTACAGAGAGACAGCAGCACCCAACGCTTAGACCTGGAGGCCACC 300 
301 AAAGCTCGACTCAGCTCCCTGGAGAGCCTCCTCCACCAATTGACCTTGGACCAGGCTGCC 360 
361 AGGCCCCAGGAGACCCAGGAGGGGCTGCAGAGGGAGCTGGGCACCCTGAGGCGGGAGCGG 420 
421 GACCAGCTGGAAACCCAAACCAGAGAGTTGGAGACTGCCTACAGCAACCTCCTCCGAGAC 480 
481 AAGTCAGTTCTGGAGGAAGAGAAGAAGCGACTAAGGCAAGAAAATGAGAATCTGGCCAGG 540

541 AGGTTGGAAAGCAGCAGCCAGGAGGTAGCAAGGCTGAGAAGGGGCCAGTGTCCCCAGACC 600

601 CGAGACACTGCTGGGGCTGTGCCACCAGGCTCCAGAGAAG

(intron #1) gtaagaatgcagagtggggggactct

gagttcagcaggtgatatggctcgtagtgacctgctacaggcgctccaggcctccctgccctttctccta
gagactgcacagctagcacaagacagatgaattaaggaaagcacacgatcaccttcaagtattacta

gtaatttagctcctgagagcttcatttagattagtggttcagagttcttgtgcccctccatgtcag

---- Intron I -10 Kb ----

aaggtaggGacattgccctgcaatttataatttatgaggtgttcaattatggaattgtcaaatattaaca
aaagtagagagactacaatgaactccaatgtagccataactcaggcccaactgttatcagcacagtcc
aatcatgttttatctttccttctctgacccccaacccatccccagtccttatctaaaatcaaatatcaaaca
ccatactctttgggagcctatttatttagttagttagttttcagacagagtttctttcttgttcccaagctgg
agtacaatagtgtagtctcggctaacagcaatctccccctccttggttcaagcaattctcctgcctcagtc

tcccaagaagctgggattatagacacctgccaccacatccagctaatttttttgtgttttagaaaagaca 
gggtttcaccatgttggccaggctggtttcgaactcctgacctcaggtgatccgcctgcctcggcctccca 
aagtgctgggattacaggcatgagccaccacgcctggccggcagcctatttaaatgtcatcctcaacat 
agtcaatccttgggccattttttcttacagtaaaattttgtctctttcttttaatcag 



(exon #2) TT TCT ACG TGG AAT TTG GAC 

661 ACT TTG GCC TTC CAG GAA CTG AAG TCC GAG CTA ACT GAA GTT CCT GCT TCC CGA ATT TTG 720
721 AAG GAG AGC CCA TCT GGC TAT CTC AGG AGT GGA GAG GGA GAC ACC G



(intron #2) 

gtatgaagttaagtttcttcccttttgtgcccacgtggtctttattcatgtctagtgctgtgttcagagaa 
tcagtatagggtaaatgcccacccaagggggaaattaacttccctgggagcagagggaggggagga 
gaagaggaacagaactctctctctctctctgttacccttgt Intron II - 3 kb 

FIG.3E 
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